Layered neural nets for pattern recognition - Acoustics, Speech and Signal Processing [see also IEEE Transactions on Signal Processing], IEEE Tr
Abstract
Adaptive threshold logic elements called ADALINES can be used in trainable pattern recognition systems. Adaptation by the LMS (least mean squares) algorithm is discussed. Threshold logic elements can realize only linearly separable functions. To implement more elaborate classification functions, multilayered ADALINE networks can be used. A pattern recognition concept involving first an “invariance net” and second a “trainable classifier” is proposed. The invariance net can be trained or designed to produce a set of outputs that are insensitive to translation, rotation, scale change, perspective change, etc., of the retinal input pattern. The outputs of the invariance net are scrambled, however. When these outputs are fed to a trainable classifier, the final outputs are descrambled and the original patterns are reproduced in standard position, orientation, scale, etc. It is expected that the same basic approach will be effective for speech recognition, where insensitivity to certain aspects of speech signals and at the same time sensitivity to other aspects of speech signals will be required. The entire recognition system is a layered network of ADALINE neurons. The ability to adapt a multilayered neural net is fundamental. A new adaptation rule is proposed for layered nets which is an extension of the MADALINE rule of the 1960s. The new rule, MRII, is a useful alternative to the back-propagation algorithm.
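As a rough illustration of two points in the abstract (LMS adaptation of a single ADALINE, and the limit to linearly separable functions), the following Python sketch trains one element on a bipolar AND and on a bipolar XOR. It is not taken from the paper: the function names, the step size `mu`, the epoch count, and the bipolar (+1/-1) encoding are all assumptions chosen for the example.

```python
import numpy as np

def train_adaline_lms(X, d, mu=0.05, epochs=100):
    """Train one ADALINE with the LMS (Widrow-Hoff) rule.

    mu and epochs are illustrative values, not taken from the paper.
    """
    # Prepend a constant +1 input so the first weight acts as the bias.
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        for x, target in zip(Xb, d):
            err = target - w @ x   # error on the linear output, before thresholding
            w += mu * err * x      # LMS weight update
    return w

def adaline_predict(w, X):
    """Threshold the linear output to a bipolar decision (+1 or -1)."""
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])
    return np.where(Xb @ w >= 0.0, 1, -1)

# All four bipolar input patterns for a two-input element.
X = np.array([[-1, -1], [-1, 1], [1, -1], [1, 1]], dtype=float)

# AND is linearly separable, so a single element can realize it.
d_and = np.array([-1, -1, -1, 1], dtype=float)
pred_and = adaline_predict(train_adaline_lms(X, d_and), X)

# XOR is not linearly separable, so no single element can realize it.
d_xor = np.array([-1, 1, 1, -1], dtype=float)
pred_xor = adaline_predict(train_adaline_lms(X, d_xor), X)

print("AND predictions:", pred_and.tolist())
print("XOR learned by one element:", pred_xor.tolist() == d_xor.astype(int).tolist())
```

The single element reproduces AND but cannot reproduce XOR whatever its weights, which is the motivation the abstract gives for multilayered ADALINE networks.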
Related resources
The use of cone-shaped kernels for generalized time-frequency representations of nonstationary signa - Acoustics, Speech and Signal Processing [see also IEEE Transactions on Signal Processing], IEEE Tr
Generalized time-frequency representations (GTFRs) which use cone-shaped kernels for nonstationary signal analysis are presented. The cone-shaped kernels are formulated for the GTFRs to produce simultaneously good resolution in time and frequency. Specifically, for a GTFR with a cone-shaped kernel, finite time support is maintained in the time dimension along with an enhanced spectrum in the ...
Inverse filtering of room acoustics - Acoustics, Speech and Signal Processing [see also IEEE Transactions on Signal Processing], IEEE Tr
A novel method is proposed for realizing exact inverse filtering of acoustic impulse responses in a room. This method is based on the principle called the multiple-input/output inverse theorem (MINT). Because a room impulse response generally has nonminimum phases, it has been impossible to realize exact inverse filtering of room acoustics using previously reported methods. However, the exact i...
Pitch detection with a neural-net classifier
Pitch detection based on neural-net classifiers is investigated. To this end, the extent of generalization attainable with neural nets is first examined, and it is shown that a suitable choice of features is required to utilize this property. Specifically, invariant features should be used whenever possible. For pitch detection, two feature sets, one based on waveform samples and the ot...
Classification of emotional speech using spectral pattern features
Speech Emotion Recognition (SER) is a new and challenging research area with a wide range of applications in man-machine interactions. The aim of an SER system is to recognize human emotion by analyzing the acoustics of speech sound. In this study, we propose Spectral Pattern features (SPs) and Harmonic Energy features (HEs) for emotion recognition. These features extracted from the spectrogram ...
Publication date: 2004